Aperiodicity Analysis for Quality Estimation of Text-to-Speech Signals
Abstract
This contribution presents a new approach towards nonintrusive quality assessment of text-to-speech (TTS) signals. Perturbation measures, which capture the degree of excitation-specific aperiodicity in voiced speech, are investigated with respect to their quality implications in synthesized speech. Based on two independent TTS databases for which formal attribute-based listening tests have been conducted, we show that perturbation measures are sensitive to quality aspects of prosody and voice characteristic. Furthermore, a dominant dependency on TTS type, namely non-uniform unit-selection and diphone synthesis, is identified. Yet, considerable differences between male and female TTS samples are recognized, emphasizing the need for gender-specific quality assessment.
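As a rough illustration of what a perturbation measure computes, the sketch below implements the commonly used "local jitter" and "local shimmer" definitions: the mean cycle-to-cycle change of the pitch period (or of the peak amplitude) divided by its mean. The abstract does not say which perturbation measures the authors used, so the choice of these two measures, the function names, and the sample values are assumptions for illustration only.

```python
from typing import Sequence


def relative_perturbation(values: Sequence[float]) -> float:
    """Mean absolute difference between consecutive values, divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values[:-1], values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


def local_jitter(periods: Sequence[float]) -> float:
    """Cycle-to-cycle variation of pitch period durations (hypothetical helper)."""
    return relative_perturbation(periods)


def local_shimmer(amplitudes: Sequence[float]) -> float:
    """Cycle-to-cycle variation of per-cycle peak amplitudes (hypothetical helper)."""
    return relative_perturbation(amplitudes)


if __name__ == "__main__":
    # Made-up pitch periods in milliseconds for one voiced segment.
    periods_ms = [10.0, 12.0, 10.0, 12.0]
    print(local_jitter(periods_ms))
```

A perfectly periodic voiced segment yields zero for both measures; larger values indicate stronger aperiodicity in the excitation.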
Similar resources
Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
A new control paradigm of source signals for high quality speech synthesis is introduced to handle a variety of speech quality, based on time-frequency analyses by the use of an instantaneous frequency and group delay. The proposed signal representation consists of a frequency domain aperiodicity measure and a time domain energy concentration measure to represent source attributes, which supplem...
Using instantaneous frequency and aperiodicity detection to estimate F0 for high-quality speech synthesis
This paper introduces a general and flexible framework for F0 and aperiodicity (additive non periodic component) analysis, specifically intended for high-quality speech synthesis and modification applications. The proposed framework consists of three subsystems: instantaneous frequency estimator and initial aperiodicity detector, F0 trajectory tracker, and F0 refinement and aperiodicity extract...
D4C, a band-aperiodicity estimator for high-quality speech synthesis
An algorithm is proposed for estimating the band aperiodicity of speech signals, where “aperiodicity” is defined as the power ratio between the speech signal and the aperiodic component of the signal. Since this power ratio depends on the frequency band, the aperiodicity should be given for several frequency bands. The proposed D4C (Definitive Decomposition Derived Dirt-Cheap) estimator is base...
Aperiodicity at Topic Structure Boundaries - Zellers & Post SP10 revised
Topic structure in longer discourses has been shown to be marked in speech by prosodic variations, e.g. variations in fundamental frequency (F0) and speech rate. We investigated whether variations in voice quality, specifically aperiodicity as an aspect of glottalization, were also signals to topic structure by varying to indicate the strength of discourse boundaries. We found that variation in...
Aperiodicity control in ARX-based speech analysis-synthesis method
We present an improved algorithm for a robust speech analysis-synthesis method based on an auto-regressive with exogenous input (ARX) speech production model proposed previously. The speech analysis-synthesis method is capable of making an automatic estimation of vocal tract (formant) and voice source parameters from a speech utterance, generating accurate formant values even for very high-pitch...